Saving Stochastic Bandits from Poisoning Attacks via Limited Data Verification
Authors
Abstract
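This paper studies bandit algorithms under data poisoning attacks in a bounded reward setting. We consider a strong attacker model in which the attacker can observe both the selected actions and their corresponding rewards, and can contaminate the rewards with additive noise. We show that any bandit algorithm with regret O(log T) can be forced to suffer a regret O(T) with an expected amount of contamination O(log T). This amount of contamination is also necessary, as we prove that there exists an O(log T) regret bandit algorithm, specifically the classical UCB, that requires Omega(log T) amount of contamination to suffer regret Omega(T). To combat such poisoning attacks, our second main contribution is to propose verification based mechanisms, which use limited verification to access a limited number of uncontaminated rewards. In particular, for the case of unlimited verifications, we show that with O(log T) expected number of verifications, a simple modified version of the Explore-then-Commit type bandit algorithm can restore the order optimal O(log T) regret irrespective of the amount of contamination used by the attacker. We also provide a UCB-like verification scheme, called Secure-UCB, that also enjoys full recovery from any attacks, also with O(log T) expected number of verifications. To derive a matching lower bound on the number of verifications, we also prove that for any order-optimal bandit algorithm, this number of verifications O(log T) is necessary to recover the order-optimal regret. On the other hand, when the number of verifications is bounded above by a budget B, we propose a novel algorithm, Secure-BARBAR, which provably achieves O(min(C, T/sqrt(B))) regret with high probability against weak attackers (i.e., attackers who have to place the contamination before seeing the actual pulls of the bandit algorithm), where C is the total amount of contamination by the attacker, which breaks the known Omega(C) lower bound of the non-verified setting if C is large.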
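The verification idea behind the Explore-then-Commit result can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name `verified_explore_then_commit`, the Bernoulli environment, the `attack` callback, and the exploration constant `8 * log(T)` are all assumptions made for illustration. The sketch only shows why a logarithmic number of verified pulls can make the committed arm independent of later contamination.

```python
import math
import random


def verified_explore_then_commit(true_means, horizon, attack, seed=0):
    """Explore-then-Commit sketch that bases its decision only on verified rewards.

    true_means: Bernoulli mean of each arm (hypothetical environment).
    attack: callable (arm, reward) -> contaminated reward (hypothetical attacker).
    Returns (committed_arm, number_of_verifications).
    """
    rng = random.Random(seed)
    num_arms = len(true_means)
    # Assumed exploration length per arm, logarithmic in the horizon T.
    pulls_per_arm = max(1, math.ceil(8 * math.log(horizon)))
    sums = [0.0] * num_arms
    verifications = 0

    # Exploration phase: every pull is verified, so the learner records the
    # uncontaminated reward and the attacker cannot bias the estimates.
    for arm in range(num_arms):
        for _ in range(pulls_per_arm):
            reward = 1.0 if rng.random() < true_means[arm] else 0.0
            sums[arm] += reward
            verifications += 1

    # Each arm was pulled equally often, so comparing sums compares means.
    best_arm = max(range(num_arms), key=lambda a: sums[a])

    # Commit phase: observed rewards may be contaminated, but they are never
    # used to revise the decision, so the attack cannot change the chosen arm.
    for _ in range(horizon - num_arms * pulls_per_arm):
        reward = 1.0 if rng.random() < true_means[best_arm] else 0.0
        _observed = attack(best_arm, reward)

    return best_arm, verifications
```

For example, calling `verified_explore_then_commit([0.9, 0.1], 10000, lambda arm, r: 0.0)` runs against an attacker that zeroes every unverified reward. The number of verifications is the number of arms times the per-arm exploration length, so it grows logarithmically in the horizon under the assumed constant.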
Similar resources
Data Poisoning Attacks against Autoregressive Models
Forecasting models play a key role in money-making ventures in many different markets. Such models are often trained on data from various sources, some of which may be untrustworthy. An actor in a given market may be incentivised to drive predictions in a certain direction to their own benefit. Prior analyses of intelligent adversaries in a machine-learning context have focused on regression an...
Certified Defenses for Data Poisoning Attacks
Machine learning systems trained on user-provided data are susceptible to data poisoning attacks, whereby malicious users inject false training data with the aim of corrupting the learned model. While recent work has proposed a number of attacks and defenses, little is understood about the worst-case loss of a defense in the face of a determined attacker. We address this by constructing approxi...
Generating random media from limited microstructural information via stochastic optimization
Random media abound in nature and in manmade situations. Examples include porous media, biological materials, and composite materials. A stochastic optimization technique that we have recently developed to reconstruct realizations of random media (given limited microstructural information in the form of correlation functions) is investigated further, critically assessed, and refined. The recons...
Stochastic Rank-1 Bandits
We propose stochastic rank-1 bandits, a class of online learning problems where at each step a learning agent chooses a pair of row and column arms, and receives the product of their values as a reward. The main challenge of the problem is that the individual values of the row and column are unobserved. We assume that these values are stochastic and drawn independently. We propose a computation...
Sparse Stochastic Bandits
In the classical multi-armed bandit problem, d arms are available to the decision maker who pulls them sequentially in order to maximize his cumulative reward. Guarantees can be obtained on a relative quantity called regret, which scales linearly with d (or with √d in the minimax sense). We here consider the sparse case of this classical problem in the sense that only a small number of arms, n...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2022
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v36i7.20777